36 research outputs found

    Graphonological Levenshtein Edit Distance: Application for Automated Cognate Identification

    This paper presents a methodology for calculating a modified Levenshtein edit distance between character strings, and applies it to the task of automated cognate identification from non-parallel (comparable) corpora. This task is an important stage in developing MT systems and bilingual dictionaries beyond the coverage of traditionally used aligned parallel corpora; it can be used for finding translation equivalents for the ‘long tail’ of the Zipfian distribution: low-frequency and usually unambiguous lexical items in closely related languages (many of them under-resourced). Graphonological Levenshtein edit distance relies on editing hierarchical representations of phonological features for graphemes (graphonological representations) and improves on the phonological edit distance proposed for measuring dialectological variation. Graphonological edit distance works directly with character strings and does not require an intermediate stage of phonological transcription, exploiting the advantages of the historical and morphological principles of orthography, which are obscured if only the phonetic principle is applied. Difficulties associated with plain feature representations (unstructured feature sets or vectors) are addressed by using a linguistically motivated feature hierarchy that restricts matching of lower-level graphonological features when higher-level features are not matched. The paper presents an evaluation of the graphonological edit distance in comparison with the traditional Levenshtein edit distance from the perspective of its usefulness for the task of automated cognate identification. It discusses the advantages of the proposed method, which can be used for morphology induction, for robust transliteration across different alphabets (Latin, Cyrillic, Arabic, etc.) and for robust identification of words with non-standard or distorted spelling, e.g., in user-generated content on the web such as posts on social media, blogs and comments.
Software for calculating the modified feature-based Levenshtein distance, and the corresponding graphonological feature representations (vectors and the hierarchies of graphemes’ features), are released on the author’s webpage: http://corpus.leeds.ac.uk/bogdan/phonologylevenshtein/. Features are currently available for the Latin and Cyrillic alphabets and will be extended to other alphabets and languages.
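
The idea of a feature hierarchy that blocks lower-level matches when a higher level fails can be illustrated with a minimal sketch. This is not the released software: the `FEATURES` table, the three-level hierarchy and the cost formula are hypothetical assumptions chosen only to show the mechanism, and the function names are invented for this example.

```python
# Hypothetical toy hierarchy: grapheme -> (top level, mid level, low level).
# Illustrative values only; these are NOT the released feature tables.
FEATURES = {
    "a": ("vowel", "open", "central"),
    "o": ("vowel", "mid", "back"),
    "e": ("vowel", "mid", "front"),
    "t": ("consonant", "plosive", "voiceless"),
    "d": ("consonant", "plosive", "voiced"),
    "s": ("consonant", "fricative", "voiceless"),
    "n": ("consonant", "nasal", "voiced"),
}


def substitution_cost(a, b):
    """Cost in [0, 1]: lower when more levels match from the top down."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # unknown grapheme: fall back to the plain Levenshtein cost
    matched = 0
    for x, y in zip(fa, fb):
        if x != y:
            break  # a higher-level mismatch blocks all lower-level matches
        matched += 1
    return 1.0 - matched / len(fa)


def feature_levenshtein(s, t):
    """Levenshtein distance with feature-based substitution costs."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, cs in enumerate(s, start=1):
        cur = [float(i)]
        for j, ct in enumerate(t, start=1):
            cur.append(min(
                prev[j] + 1.0,                             # deletion
                cur[j - 1] + 1.0,                          # insertion
                prev[j - 1] + substitution_cost(cs, ct),   # substitution
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(feature_levenshtein("dot", "tot"))  # d/t differ only at the lowest level
    print(feature_levenshtein("sot", "aot"))  # s/a differ at the top level
```

In this toy table, `d` and `t` share the top two levels, so substituting one for the other costs one third of a plain substitution, whereas a vowel-consonant substitution costs the full 1.0 even if lower-level labels happened to coincide.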

    Unsupervised Induction of Ukrainian Morphological Paradigms for the New Lexicon: Extending Coverage for Named Entities and Neologisms using Inflection Tables and Unannotated Corpora

    The paper presents an unsupervised method for quickly extending a Ukrainian lexicon by generating paradigms and morphological feature structures for new Named Entities and neologisms, which are not covered by existing static morphological resources. This approach addresses the practical problem of modelling paradigms for entities created by dynamic processes in the lexicon: this problem is especially serious for highly inflected languages in domains with a specialised or quickly changing lexicon. The method uses an unannotated Ukrainian corpus and a small fixed set of inflection tables, which can be found in traditional grammar textbooks. The advantage of the proposed approach is that updating the morphological lexicon does not require training or linguistic annotation, allowing fast, knowledge-light extension of an existing static lexicon to improve morphological coverage on a specific corpus. The method is implemented in an open-source package in a GitHub repository. It can be applied to other low-resourced inflectional languages which have internet corpora and linguistic descriptions of their inflection system, following the example of the inflection tables for Ukrainian. Evaluation results show consistent improvements in coverage for Ukrainian corpora of different types.
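
The general approach of combining fixed inflection tables with an unannotated corpus can be sketched as follows. This is not the authors' package: the two tables are heavily simplified, the selection rule (pick the table whose generated forms are most often attested) is an assumption made for illustration, and all names here are hypothetical.

```python
# Hypothetical, heavily simplified inflection tables (lists of endings).
# Real Ukrainian paradigms have more cases, variants and stem alternations.
INFLECTION_TABLES = {
    "masc_hard": ["", "а", "у", "ом", "і"],
    "fem_a": ["а", "и", "у", "ою"],
}


def induce_paradigm(word, corpus_forms, tables=INFLECTION_TABLES, min_support=2):
    """Guess a paradigm for an unknown word form.

    Each ending that matches the word yields a candidate stem; the stem is
    expanded with the whole table, and the candidate whose generated forms
    are attested most often in the corpus wins.
    Returns (support, table_name, stem, forms) or None.
    """
    best = None
    for name, endings in tables.items():
        for ending in endings:
            if not word.endswith(ending):
                continue
            stem = word[: len(word) - len(ending)]
            if not stem:
                continue
            forms = [stem + e for e in endings]
            support = sum(form in corpus_forms for form in forms)
            if support >= min_support and (best is None or support > best[0]):
                best = (support, name, stem, forms)
    return best


if __name__ == "__main__":
    # Toy set of word forms standing in for an unannotated corpus.
    corpus = {"блогер", "блогера", "блогером", "флешка", "флешки", "флешку"}
    print(induce_paradigm("блогера", corpus))
    print(induce_paradigm("флешку", corpus))
```

The form "флешку" is compatible with both toy tables, and the corpus evidence decides between them: three generated forms are attested for the feminine table against two for the masculine one. No training or annotation is involved, only string generation and lookup.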

    MoBiL: A hybrid feature set for Automatic Human Translation quality assessment

    In this paper we introduce MoBiL, a hybrid Monolingual, Bilingual and Language modelling feature set, together with a feature selection and evaluation framework. The set includes translation quality indicators that can be used to automatically predict the quality of human translations in terms of content adequacy and language fluency. We compare MoBiL with the QuEst baseline set by using both in classifiers trained with support vector machine and relevance vector machine learning algorithms on the same data set. We also report an experiment on feature selection that aims to retain fewer but more informative features from MoBiL. Our experiments show that classifiers trained on our feature set perform consistently better in predicting both adequacy and fluency than classifiers trained on the baseline feature set. MoBiL also performs well when used with both support vector machine and relevance vector machine algorithms.

    The effect of physical exercise on the psychophysiological state of first-year students of the Faculty of Veterinary Medicine

    The goal of physical education in educational institutions is to contribute to the training of harmoniously developed, highly qualified specialists. The physical education course addresses the following tasks: fostering high moral, volitional and physical qualities in students; preparing them for highly productive work; preserving and strengthening health; promoting the proper formation and all-round development of the body; and maintaining high working capacity throughout the period of study.
The role and importance of physical culture and sports lies in: - shortening the period of professional adaptation and improving professional skill, labour productivity and resistance to adverse factors of the working environment; - preserving health and reducing injuries, and forming and improving professionally important motor skills in order to broaden and deepen motor capabilities and motor experience in mastering the profession as far as possible; - improving professionally important psychophysiological functions of the body in order to raise the professional level and resistance to adverse factors of the working environment.

    Developing the special working capacity of future food industry technologists through physical education classes

    Physical education in the higher education system should be based on new teaching technologies that ensure the physical and psychophysiological readiness of students to fulfil their professional responsibilities. Under current conditions of scientific and technological progress, profession-oriented physical training of students in higher education institutions of Ukraine becomes important. Preserving and strengthening the health of students, and forming in them a readiness for professional activity, national-patriotic and cultural-spiritual education, and a need for physical improvement and a healthy lifestyle, is one of the tasks of physical education. At the same time, given the scientific and practical interest in the problem of physical training of students, this area of the physical education system requires further study and improvement.
Therefore, finding ways to improve the effectiveness of physical education classes for students in higher education institutions of Ukraine is one of the priority areas of research in the field of physical culture.

    Information retrieval and text mining technologies for chemistry

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.